A Kernel-Based Case Retrieval Algorithm with Application to Bioinformatics
Abstract
Case retrieval in case-based reasoning relies heavily on a well-designed similarity function. This paper presents an approach that uses the correlative information among features to compute case similarity for retrieval. It does so by extending dot-product-based linear similarity measures to nonlinear versions with kernel functions. An application to the peptide retrieval problem in bioinformatics shows the effectiveness of the approach. In this problem, the objective is to retrieve, from a large database of known peptides, the peptide corresponding to an input tandem mass spectrum. Because a kernel function implicitly maps the tandem mass spectrum to a high-dimensional space, the correlative information among fragment ions in the spectrum can be modeled, which dramatically reduces stochastic mismatches. An experiment on a dataset of real spectra shows a significant 10% reduction in the error rate compared with a common linear similarity function.
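The step from a dot-product similarity to a kernel-based one can be illustrated with a minimal sketch. The abstract does not say which linear measure or which kernel the paper uses, so everything here is an assumption made for illustration: cosine similarity stands in for the linear measure, a degree-2 polynomial kernel stands in for the kernel, and the names `poly_kernel`, `kernel_cosine`, and `retrieve`, as well as the toy binary vectors, are hypothetical.

```python
import math


def dot(x, y):
    """Plain dot product of two equal-length numeric sequences."""
    return sum(a * b for a, b in zip(x, y))


def poly_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (assumed here; the paper's kernel is not given).

    For degree 2 the implicit feature space contains pairwise products
    of features, which is one way to capture feature correlations.
    """
    return (dot(x, y) + coef0) ** degree


def linear_cosine(x, y):
    """Linear baseline: cosine similarity. Assumes non-zero vectors."""
    return dot(x, y) / math.sqrt(dot(x, x) * dot(y, y))


def kernel_cosine(x, y, kernel=poly_kernel):
    """Cosine similarity with every dot product replaced by a kernel."""
    return kernel(x, y) / math.sqrt(kernel(x, x) * kernel(y, y))


def retrieve(query, cases, similarity):
    """Return the name of the stored case most similar to the query."""
    return max(cases, key=lambda name: similarity(query, cases[name]))


# Toy case base: hypothetical binary feature vectors, not real spectra.
cases = {"A": [1, 1, 0, 0], "B": [1, 0, 1, 0]}
query = [1, 1, 0, 1]

best_linear = retrieve(query, cases, linear_cosine)
best_kernel = retrieve(query, cases, kernel_cosine)
```

Swapping `linear_cosine` for `kernel_cosine` changes only the similarity function passed to `retrieve`; the retrieval loop itself is untouched, which is the practical appeal of kernelizing a dot-product-based measure.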
Similar resources
An Interior Point Algorithm for Solving Convex Quadratic Semidefinite Optimization Problems Using a New Kernel Function
In this paper, we consider convex quadratic semidefinite optimization problems and provide a primal-dual Interior Point Method (IPM) based on a new kernel function with a trigonometric barrier term. Iteration complexity of the algorithm is analyzed using some easy to check and mild conditions. Although our proposed kernel function is neither a Self-Regular (SR) fun...
Application of a simple likelihood ratio approximant to protein sequence classification
MOTIVATION Likelihood ratio approximants (LRA) have been widely used for model comparison in statistics. The present study was undertaken in order to explore their utility as a scoring (ranking) function in the classification of protein sequences. RESULTS We used a simple LRA based on the maximal similarity (or minimal distance) scores of the two top ranking sequence classes. The scoring meth...
A path following interior-point algorithm for semidefinite optimization problem based on new kernel function
In this paper, we obtain some new complexity results for solving the semidefinite optimization (SDO) problem by interior-point methods (IPMs). We define a new proximity function for the SDO by a new kernel function. Furthermore, we formulate an algorithm for a primal-dual interior-point method (IPM) for the SDO by using the proximity function and give its complexity analysis, and then we sho...
A Modified Grasshopper Optimization Algorithm Combined with CNN for Content Based Image Retrieval
Nowadays, with huge progress in digital imaging, new image processing methods are needed to manage digital images stored on disks. Image retrieval has been one of the most challenging fields in digital image processing; it means searching a large database to return images similar to the query image. Although much effective research has been performed on this topic so far, t...
Chaotic Genetic Algorithm based on Explicit Memory with a new Strategy for Updating and Retrieval of Memory in Dynamic Environments
Many of the problems considered in optimization and learning assume that solutions exist in a dynamic environment. Hence, algorithms are required that adapt dynamically to the problem’s conditions and search under new conditions. In most cases, using information from the past allows quick adaptation after a change. This is the idea underlying the use of memory in this field, which involves key design issue...
Publication date: 2004